# Instruction fine-tuning optimization
## Gervasio 8b Portuguese Ptpt Decoder
Gervásio 8B PTPT is an open-source decoder model for Portuguese, fine-tuned from LLaMA 3.1 8B Instruct, with strong text generation capabilities.
License: MIT · Category: Large Language Model · Tags: Transformers, Other · Publisher: PORTULAN · Downloads: 105 · Likes: 1

## Tiiuae.falcon H1 34B Instruct GGUF
Falcon-H1-34B-Instruct is a 34B-parameter large language model focused on instruction-following tasks.
Category: Large Language Model · Publisher: DevQuasar · Downloads: 319 · Likes: 1
## Thedrummer Rivermind Lux 12B V1 GGUF
A 12B-parameter large language model quantized with llama.cpp's imatrix method, offered in multiple quantized versions to suit different hardware.
Category: Large Language Model · Publisher: bartowski · Downloads: 1,353 · Likes: 1
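Several entries on this page ship llama.cpp GGUF quantizations like this one. As a rough illustration of the underlying idea only (block-wise low-bit storage with a per-block scale; not llama.cpp's actual imatrix algorithm), a minimal symmetric 4-bit block quantizer:

```python
import numpy as np

def quantize_block_q4(w: np.ndarray):
    """Symmetric 4-bit quantization of one weight block.

    Stores signed integers in [-8, 7] plus one float scale per block,
    mirroring the general structure (not the exact layout) of llama.cpp
    block quantization formats.
    """
    max_abs = np.abs(w).max()
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_block_q4(q: np.ndarray, scale: float) -> np.ndarray:
    # Reconstruct approximate float weights from integers and scale.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=32).astype(np.float32)  # one 32-value block
q, s = quantize_block_q4(w)
w_hat = dequantize_block_q4(q, s)
print("max abs error:", np.abs(w - w_hat).max())
```

The imatrix ("importance matrix") variants additionally weight this rounding by calibration-data statistics so that important weights are quantized more accurately.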
## MN Nyx Chthonia 12B
A merge of multiple 12B-scale models: seven pre-trained language models with distinct characteristics are combined using the model_stock method to strengthen overall capability.
Category: Large Language Model · Tags: Transformers · Publisher: mergekit-community · Downloads: 31 · Likes: 2
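The model_stock method merges checkpoints in weight space using the geometry between fine-tuned weights. As a simplified, hedged sketch of weight-space merging in general (a plain linear average over state dicts; all names here are illustrative, not mergekit's API):

```python
import numpy as np

def linear_merge(state_dicts, weights=None):
    """Average several model state dicts tensor-by-tensor.

    A plain linear merge; methods like model_stock or DELLA add
    smarter per-tensor weighting on top of this basic idea.
    """
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key] for w, sd in zip(weights, state_dicts))
    return merged

# Toy two-model merge with a single tensor each (illustrative only).
a = {"layer.weight": np.array([1.0, 2.0])}
b = {"layer.weight": np.array([3.0, 6.0])}
m = linear_merge([a, b])
print(m["layer.weight"])  # [2. 4.]
```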
## Nano R1 Model
A Qwen2 model optimized with Unsloth and the Hugging Face TRL library, reported to train about 2x faster.
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers, English · Publisher: Mansi-30 · Downloads: 25 · Likes: 2
## Qwq 32B Gptqmodel 4bit Vortex V1
QwQ-32B is a 32B-parameter large language model based on the Qwen2 architecture, quantized to 4-bit integers with the GPTQ method for efficient text generation.
License: Apache-2.0 · Category: Large Language Model · Tags: Safetensors, English · Publisher: ModelCloud · Downloads: 1,620 · Likes: 11
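GPTQ stores weights as low-bit integers with per-group scales and zero points, chosen to minimize layer output error. The sketch below shows only the round-to-nearest storage format; GPTQ's Hessian-based error compensation is omitted:

```python
import numpy as np

def quantize_group_asym(w, bits=4):
    """Asymmetric round-to-nearest quantization for one weight group.

    Returns 4-bit unsigned integers plus one scale and zero point per
    group, the storage scheme GPTQ-style checkpoints use. The actual
    GPTQ algorithm additionally corrects rounding error column-by-column
    using second-order (Hessian) information.
    """
    qmax = 2**bits - 1
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero = np.round(-lo / scale)
    q = np.clip(np.round(w / scale) + zero, 0, qmax).astype(np.uint8)
    return q, scale, zero

def dequantize_group(q, scale, zero):
    return (q.astype(np.float32) - zero) * scale

rng = np.random.default_rng(1)
w = rng.normal(size=128).astype(np.float32)  # one group of 128 weights
q, s, z = quantize_group_asym(w)
w_hat = dequantize_group(q, s, z)
print("max abs error:", np.abs(w - w_hat).max())
```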
## Llama 3.1 8B UltraLong 1M Instruct
The Nemotron-UltraLong-8B series is designed for processing ultra-long text sequences, with variants supporting context windows of up to 4 million tokens while maintaining strong performance; this entry is the 1M-token instruct variant.
Category: Large Language Model · Tags: Transformers, English · Publisher: nvidia · Downloads: 1,387 · Likes: 26
## Buddyglassuncensored2025.4
A merged model based on Mistral-Small-24B-Instruct-2501, combining several 24B-scale models with the DARE TIES fusion method.
Category: Large Language Model · Tags: Transformers · Publisher: darkc0de · Downloads: 52 · Likes: 4
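DARE TIES merges operate on deltas (fine-tuned weights minus base weights): DARE randomly drops a large fraction of each delta and rescales the survivors so the expected delta is unchanged, and TIES then resolves sign conflicts between models. A minimal sketch of the DARE step only (parameter names are illustrative):

```python
import numpy as np

def dare_delta(base, finetuned, drop_rate=0.9, seed=0):
    """DARE: randomly drop delta parameters and rescale survivors.

    delta = finetuned - base; each element is kept with probability
    (1 - drop_rate) and scaled by 1 / (1 - drop_rate), so the expected
    value of the delta is preserved. (TIES sign resolution not shown.)
    """
    rng = np.random.default_rng(seed)
    delta = finetuned - base
    mask = rng.random(delta.shape) >= drop_rate
    return base + mask * delta / (1.0 - drop_rate)

base = np.zeros(100_000)
finetuned = np.ones(100_000)
merged = dare_delta(base, finetuned, drop_rate=0.9)
print("mean of merged:", merged.mean())  # close to 1.0 in expectation
```

About 10% of the entries survive, each scaled to 10, so the mean delta stays near 1.0 despite 90% of it being dropped.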
## Llama Krikri 8B Instruct GGUF
A Greek instruction-tuned large language model based on Llama-3.1-8B, strengthening Greek-language capability and supporting multilingual tasks.
Category: Large Language Model · Tags: Transformers · Publisher: ilsp · Downloads: 257 · Likes: 11

## Progenitor V3.3 LLaMa 70B
This project fuses multiple 70B-scale pre-trained language models into one higher-performing model, merging them with the Linear DELLA method on top of the Llama 3.3 instruct model.
Category: Large Language Model · Tags: Transformers · Publisher: Tarek07 · Downloads: 101 · Likes: 10

## Llama SEA LION V3 8B IT
SEA-LION is a series of large language models pre-trained and instruction-tuned for Southeast Asia, built to address the region's multilingual processing needs and to support natural language processing for Southeast Asian languages.
Category: Large Language Model · Tags: Transformers, Multilingual · Publisher: aisingapore · Downloads: 3,954 · Likes: 7

## Ichigo Llama3.1 S Instruct V0.4
A multimodal language model based on the Llama-3 architecture that understands both audio and text input, with noise robustness and multi-turn dialogue capability.
License: Apache-2.0 · Category: Text-to-Audio · Tags: Safetensors, English · Publisher: homebrewltd · Downloads: 486 · Likes: 19

## Llama 3.2 Korean Bllossom 3B
Bllossom-3B is a Korean-English bilingual enhancement of meta-llama/Meta-Llama-3.2-3B, trained with full-parameter fine-tuning on curated Korean data; it strengthens Korean processing while fully retaining English capability.
Category: Large Language Model · Tags: Transformers, Multilingual · Publisher: Bllossom · Downloads: 12.52k · Likes: 173

## Ichigo Llama3.1 S Instruct V0.3 Phase 2
The Ichigo-llama3s series natively understands audio and text input; it is based on the Llama-3 architecture and uses WhisperVQ as the tokenizer for audio files.
License: Apache-2.0 · Category: Text-to-Audio · Tags: English · Publisher: homebrewltd · Downloads: 16 · Likes: 5

## Llama 3.1 8B Instuct Uz GGUF
A statically quantized version of behbudiy/Llama-3.1-8B-Instuct-Uz, supporting Uzbek and English and suitable for a range of text generation tasks.
Category: Large Language Model · Tags: Multilingual · Publisher: mradermacher · Downloads: 241 · Likes: 1
## Solar Pro Preview Instruct
Solar Pro Preview is an advanced large language model with 22 billion parameters, designed to run on a single GPU while delivering excellent performance.
License: MIT · Category: Large Language Model · Tags: Transformers, English · Publisher: upstage · Downloads: 10.60k · Likes: 448

## Eurollm 1.7B
EuroLLM-1.7B is the first pre-trained model in the EuroLLM series; it is multilingual and can understand and generate text in many European and other related languages.
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers, Multilingual · Publisher: utter-project · Downloads: 3,444 · Likes: 65

## Mistral Nemo Base 2407 Chatml
Mistral-Nemo-Base-2407 is a 12-billion-parameter generative text pre-training model jointly trained by Mistral AI and NVIDIA, outperforming models of similar or smaller size.
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers, Multilingual · Publisher: IntervitensInc · Downloads: 191 · Likes: 3
## Meta Llama 3.1 405B Instruct GGUF
Meta-Llama-3.1-405B-Instruct is a 405-billion-parameter large language model based on the Llama 3.1 architecture, optimized for instruction-following tasks and supporting multiple languages.
Category: Large Language Model · Tags: Multilingual · Publisher: MaziyarPanahi · Downloads: 189.43k · Likes: 14

## Meta Llama 3.1 8B Instruct GGUF
Llama-3.1-8B-Instruct is an 8B-parameter large language model released by Meta, focused on instruction-following tasks.
Category: Large Language Model · Publisher: DevQuasar · Downloads: 485 · Likes: 3

## Llama 3 Instruct 8B SimPO SPPO Iter3 Merge
A merged pre-trained language model built on Meta Llama 3 that combines the strengths of the SimPO and SPPO-Iter3 models, suitable for text generation tasks.
Category: Large Language Model · Tags: Transformers · Publisher: grimjim · Downloads: 8,305 · Likes: 4
## UCCIX Llama2 13B Instruct
UCCIX-Llama2-13B-Instruct is an Irish-English bilingual large language model built on the Llama 2-13B architecture, with specific optimizations for Irish-language processing.
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers, Multilingual · Publisher: ReliableAI · Downloads: 21 · Likes: 2

## Llama3 8B Cn Rochat V1
A Chinese role-play model instruction fine-tuned from hfl/llama-3-chinese-8b-instruct-v3.
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers · Publisher: RochatAI · Downloads: 14 · Likes: 7

## Orca Mini V5 8b Dpo
An 8B-parameter model based on the Llama 3 architecture, trained with several DPO datasets and focused on text generation tasks.
Category: Large Language Model · Tags: Transformers, English · Publisher: pankajmathur · Downloads: 16 · Likes: 3

## Wizardlm 2 7B Abliterated
An "abliterated" version of WizardLM-2-7B: weight orthogonalization is applied to suppress specific behavioral patterns (typically refusals).
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers · Publisher: fearlessdots · Downloads: 237 · Likes: 14

## Xgen Mm Phi3 Mini Base R V1
XGen-MM is the latest multimodal large-model series from Salesforce AI Research; building on BLIP's successful design, it makes fundamental enhancements for a stronger model architecture.
License: Apache-2.0 · Category: Image-to-Text · Tags: Transformers, English · Publisher: Salesforce · Downloads: 240 · Likes: 18
## LLAMA 3 Quantized
An 8-bit quantized version of the Meta Llama 3 8B Instruct large language model; the smaller footprint and faster inference make it suitable for deployment on resource-constrained devices.
License: MIT · Category: Large Language Model · Tags: Transformers · Publisher: Kameshr · Downloads: 18 · Likes: 9

## Idefics2 8b Chatty
Idefics2 is an open multimodal model that accepts arbitrary sequences of images and text as input and generates text output. It can answer questions about images, describe visual content, create stories grounded in multiple images, or act as a pure language model.
License: Apache-2.0 · Category: Image-to-Text · Tags: Transformers, English · Publisher: HuggingFaceM4 · Downloads: 617 · Likes: 94

## Llama 3 Bophades V1 8B
A merged model based on Llama-3-8b, combining multiple pre-trained language models with the Model Stock method.
License: Other · Category: Large Language Model · Tags: Transformers · Publisher: nbeerbower · Downloads: 24 · Likes: 3
## Mistral 7B Instruct V0.2 Fp8
Mistral-7B-Instruct-v0.2 quantized to FP8 precision by FriendliAI, significantly improving inference efficiency while maintaining high accuracy.
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers · Publisher: FriendliAI · Downloads: 37 · Likes: 12

## Mixtral Chat 7b
A hybrid model created by merging several Mistral-7B variants with the mergekit tool, focused on text generation tasks.
License: MIT · Category: Large Language Model · Tags: English · Publisher: LeroyDyer · Downloads: 76 · Likes: 2

## Gemma 2b Mt German To English
A version of Google's Gemma 2B fine-tuned on German instructions, exploring German-to-English translation through vocabulary expansion.
License: MIT · Category: Machine Translation · Tags: Transformers, Multilingual · Publisher: Samvardhan777 · Downloads: 20 · Likes: 2

## Gemma 2b
Gemma is a lightweight open large language model from Google, built on the same technology as Gemini and suitable for a variety of text generation tasks.
Category: Large Language Model · Tags: Transformers · Publisher: alpindale · Downloads: 135 · Likes: 8

## Jais 30b Chat V3
Jais-30b-chat-v3 is fine-tuned from Jais-30b-v3 on curated Arabic and English Q&A data; it is optimized for Arabic and English conversation and supports a long context of 8,000 tokens.
Category: Large Language Model · Tags: Transformers · Publisher: inceptionai · Downloads: 617 · Likes: 23

## Gemma 7b It
Gemma is a lightweight open model series from Google, built on the same technology as Gemini and suited to text generation tasks.
Category: Large Language Model · Publisher: google · Downloads: 77.07k · Likes: 1,163

## Gemma 2b
Gemma is a lightweight open large language model series from Google, built on the technology used to create the Gemini models; this is the 2-billion-parameter base version.
Category: Large Language Model · Publisher: google · Downloads: 402.85k · Likes: 994

## Malayallm 7B Base
MalayaLLM is a generative-AI language model for Malayalam, built by continued pre-training of LLaMA2 with roughly 18,000 added Malayalam vocabulary tokens.
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers, Multilingual · Publisher: VishnuPJ · Downloads: 21 · Likes: 5
## Mistral 7B Instruct V0.2 Sparsity 20 V0.1
Mistral-7B-Instruct-v0.2, an instruction-fine-tuned model improved over Mistral-7B-Instruct-v0.1, pruned to 20% sparsity with the Wanda method; it maintains competitive performance without any retraining.
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers · Publisher: wang7776 · Downloads: 80 · Likes: 1
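Wanda prunes without retraining by scoring each weight as |W| times the L2 norm of its input activation channel (measured on calibration data), then zeroing the lowest-scoring fraction of weights per output row. A small self-contained sketch of that criterion, with illustrative shapes:

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.2):
    """Prune weights with the Wanda criterion: score = |W| * ||x_j||_2.

    W: (out_features, in_features) weight matrix.
    X: (n_samples, in_features) calibration activations.
    Zeroes the lowest-scoring `sparsity` fraction of weights in each
    row, with no retraining.
    """
    act_norm = np.linalg.norm(X, axis=0)       # per-input-channel L2 norm
    score = np.abs(W) * act_norm               # Wanda importance score
    k = int(W.shape[1] * sparsity)             # weights to drop per row
    idx = np.argsort(score, axis=1)[:, :k]     # lowest-scoring columns
    W_pruned = W.copy()
    np.put_along_axis(W_pruned, idx, 0.0, axis=1)
    return W_pruned

rng = np.random.default_rng(2)
W = rng.normal(size=(8, 10))
X = rng.normal(size=(64, 10))
Wp = wanda_prune(W, X, sparsity=0.2)
print("sparsity:", (Wp == 0).mean())  # 0.2
```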
## Mobilellama 1.4B Chat
MobileLLaMA-1.4B-Chat is a chat model fine-tuned from MobileLLaMA-1.4B-Base, using the ShareGPT dataset for supervised instruction fine-tuning.
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers · Publisher: mtgv · Downloads: 580 · Likes: 20

## Phi2 Chinese 0.2B
A 200-million-parameter Chinese causal language model based on the Phi2 architecture, supporting text generation tasks.
License: Apache-2.0 · Category: Large Language Model · Tags: Transformers, Multilingual · Publisher: charent · Downloads: 65 · Likes: 30